Reading Distinction in MT
Abstract
In any system for Natural Language Processing having a dictionary, the question arises as to which entries are included in it. In this paper, I address the subquestion as to whether a lexical unit having two senses should be considered ambiguous or vague with respect to them. The inadequacy of some common strategies to answer this question in Machine Translation (MT) systems is shown. From a semantic conjecture, tests are developed that are argued to give more consistent and theoretically well-founded results.

* I would like to thank my colleagues at the university and in Eurotra, especially Louis des Tombe and Henk Verkuyl, for their helpful comments.

1 Introduction

In any system for Natural Language Processing having a dictionary, the question arises which entries are included in it. In this paper, I will assume the environment of a multilingual MT system based on a linguistic analysis and transfer architecture, from which I will derive some argumentation.

The question which entries are included in the dictionary should be answered in two parts. First there is a mapping from graphic words to lexical units (lu's), then a mapping from lu's to readings, each of which is represented in an entry. The former mapping represents a certain level of analysis of the graphic word. It abstracts away from inflection and spelling variation, and, depending on the system's analysis component, may do so as well for productive derivation and compounding, and multi-word units. In this paper I will concentrate on the latter mapping, reading distinction, in a way that does not appeal to a particular choice on the relation between lu and graphic word.

A consistent approach to reading distinction is necessary, because inconsistencies in reading distinction in an MT system will complicate transfer components between a pair of languages, and jeopardize extensibility of the system. A correct solution will save time in development and improve performance.
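The two-part mapping described above (graphic word to lu, then lu to readings, with one entry per reading) can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions, not the design of any actual MT system: the class names `Reading` and `Dictionary`, the method names, and the toy entries for "bat" are all hypothetical, and the sketch simply stores whatever reading distinctions have been decided on without deciding them itself.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Reading:
    """One dictionary entry: a single reading of a lexical unit (hypothetical type)."""
    lu: str
    gloss: str


@dataclass
class Dictionary:
    # Stage 1: graphic word -> lexical unit (abstracts away from inflection/spelling).
    word_to_lu: Dict[str, str] = field(default_factory=dict)
    # Stage 2: lexical unit -> its readings, one entry per reading.
    lu_to_readings: Dict[str, List[Reading]] = field(default_factory=dict)

    def add_entry(self, lu: str, gloss: str, forms: Iterable[str] = ()) -> None:
        """Register one reading of `lu`, plus any graphic forms that map to it."""
        for form in (lu, *forms):
            self.word_to_lu[form.lower()] = lu
        self.lu_to_readings.setdefault(lu, []).append(Reading(lu, gloss))

    def lookup(self, graphic_word: str) -> List[Reading]:
        """Apply both mappings in turn; unknown words yield no readings."""
        lu = self.word_to_lu.get(graphic_word.lower())
        if lu is None:
            return []
        return list(self.lu_to_readings[lu])


# Toy data (assumed for illustration only): one lu stored with two readings.
toy = Dictionary()
toy.add_entry("bat", "flying mammal", forms=["bats"])
toy.add_entry("bat", "club used in sports", forms=["bats"])

readings = toy.lookup("Bats")
```

Keeping the two stages in separate tables mirrors the point that reading distinction can be discussed independently of how graphic words are related to lu's: changing the first table does not affect how many entries an lu has.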
The central question in this area can be formulated as in (1).

(1) Given an lu X and two of its senses S1 and S2, is X ambiguous or vague with respect to S1 and S2?

In (1), a sense of an lu is the meaning the lu has in a certain set of contexts. If the lu is vague, both senses are covered by the same reading. If it is ambiguous, S1 and S2 are examples of different readings of the lu, each reading being represented in a single entry.

2 Some common methods

Since every reading distinction creates lexical ambiguity that has to be resolved, it seems attractive to use the features expressing the relevant information as the criterion for answering (1): An lu is ambiguous between S1 and S2 iff there is a feature describing the difference. If we only take morphological and syntactic features, many intuitively clear cases of ambiguity (e.g. bank as financial institution vs. as